The main objective of stock market investors is to maximize their gains. As 
a result, interest in stock price forecasting has not waned in recent decades. 
Nevertheless, stock prices are influenced by news, rumor, and various 
economic factors. Moreover, the characteristics of specific stock markets can 
differ significantly between countries and regions, based on size, liquidity, 
and regulations. Accordingly, it is difficult to predict stock prices that are 
volatile and noisy. This paper presents a hybrid model combining singular 
spectrum analysis (SSA) and nonlinear autoregressive neural network 
(NARNN) to forecast close prices of stocks. The model starts by applying 
the SSA to decompose the price series into various components. Each 
component is then used to train a NARNN for future price forecasting. In 
comparison to the autoregressive integrated moving average (ARIMA) and 
NARNN models, the SSA-NARNN model performs better, demonstrating 
the effectiveness of SSA in extracting hidden information and reducing the 
noise of price series. 

Keywords: Stock price prediction

1. INTRODUCTION 

A stock exchange is a legitimate organization that provides opportunities for investing in firms by 
purchasing or selling their listed shares [1]. The stock market functions similarly to other economic markets, 
i.e., buyers aim to pay the lowest possible price for the stock, while sellers aim for higher prices. Investing in 
stock markets can yield considerable gains, making it more appealing than low-yielding assets such as 
government bonds. Moreover, the high liquidity of the stock market allows investors to convert 
their assets into cash quickly [2]. Nevertheless, only a few people engage in stock trading due to the challenges 
of forecasting stock prices, which increase investment risk. 

During the last decade, analytical and computational methods have advanced and given rise to 
several innovative approaches to analyzing financial time series based on nonlinear and nonstationary 
models [3]. Machine learning models have been effectively employed in various sectors, including the stock 
market. According to the literature, artificial neural networks (ANNs) are the main machine learning 
method used in forecasting various financial markets [4]-[7]. An ANN is a suitable tool for forecasting stock 
prices because it does not impose prior assumptions about the data properties and can capture nonlinear 
functions [6]. ANNs are classified as static or dynamic networks. Dynamic neural networks, 
e.g., the nonlinear autoregressive neural network (NARNN), compute the output using a number of its 
preceding inputs. Thus, dynamic networks possess a memory that supports recognizing time-varying 
characteristics [8]. Unfortunately, the high noise level and the interaction between hidden features in the price 
series decrease the ANNs' prediction efficiency [9]-[11]. Therefore, a data preprocessing technique, e.g., 
singular spectrum analysis or multicomponent amplitude and frequency modulated (AM-FM) models 
[12]-[15], is needed to improve the prediction accuracy by reducing noise and extracting the underlying 
information hidden in the price series. 

The singular spectrum analysis (SSA) has a variety of applications, including signal denoising, trend 
extraction, and forecasting [16]. Hassani et al. [17] employed the SSA to forecast the GBP/USD exchange 
rates. The authors found that the SSA model performed better than the random walk model. Fenghua et al. 
[18] employed the SSA for decomposing stock prices into trend, fluctuation, and noise terms. Then, a 
support vector machine (SVM) was used to predict each term. Afterward, Abdollahzade et al. [19] proposed a 
model integrating neuro-fuzzy modeling with SSA to predict nonlinear chaotic time series. The authors concluded 
that the SSA improved the prediction performance due to noise reduction in the original time series. Lahmiri 
[20] presented a hybrid forecasting model that integrates SSA and an SVM optimized by particle swarm optimization. The 
performance of the proposed model was evaluated with intraday stock prices, and results indicated its 
promise for predicting noisy time series. Later, Xiao et al. [21] forecasted the Shanghai composite index 
using the SSA-SVM model established by [18]. The authors compared the SSA with the empirical mode 
decomposition and found that the former yields better forecasts. Recently, Sulandari et al. [22] integrated the 
SSA and ANN to forecast time series and found that the SSA-ANN model outperformed the ANN model. 

This paper presents a hybrid model combining SSA and NARNN for stock price forecasting. First, 
the model divides the weekly stock closing prices into training and testing sets. Second, the SSA decomposes 
the training set into various components to extract hidden features and decrease the noise. Third, a NARNN 
is constructed and trained for each decomposed component. Fourth, the model predicts the future values of 
the various components by decomposing the preceding available prices. Finally, the SSA-NARNN model 
aggregates the predicted values to obtain the final output. These procedures thus simulate the real 
trading process and avoid inserting any information regarding the stock's future performance in the training 
process. The suggested model's reliability is demonstrated using the weekly closing prices of twenty-four 
stocks listed on the Egyptian Exchange. Additionally, the superiority of the SSA-NARNN model is 
demonstrated by comparison to the autoregressive integrated moving average (ARIMA) and the single 
NARNN model. 


2. RESEARCH METHODS 
2.1. Singular spectrum analysis (SSA) 

The SSA decomposes signals using singular value decomposition (SVD) and generates singular 
values containing information regarding the original time series [23]. The SSA approach is divided into two 
stages, decomposition and reconstruction, each of which consists of two steps [24]. This section outlines the 
SSA's stages. 


2.1.1. Decomposition 

The decomposition is performed in two steps: embedding and SVD. Embedding is the first step in 
which the price series is converted to a lagged trajectory matrix [24]. If $s_t = [s_1, s_2, \ldots, s_N]^T$ is a time series 
of length N, the mapped trajectory matrix, M, is defined as [25], [26]: 


$$
M = \begin{bmatrix}
s_1 & s_2 & \cdots & s_K \\
s_2 & s_3 & \cdots & s_{K+1} \\
\vdots & \vdots & \ddots & \vdots \\
s_L & s_{L+1} & \cdots & s_N
\end{bmatrix} \qquad (1)
$$


where L is the embedding dimension, which satisfies 2<L<N, and K=N-L+1 is the number of columns. M is a Hankel matrix. For instance, a trajectory 
matrix of a price series, S=[1.90, 1.62, 1.55, 1.60, 1.56, 1.47, 1.51, 1.48, 1.66], for L=5 is formulated as: 


$$
M = \begin{bmatrix}
1.90 & 1.62 & 1.55 & 1.60 & 1.56 \\
1.62 & 1.55 & 1.60 & 1.56 & 1.47 \\
1.55 & 1.60 & 1.56 & 1.47 & 1.51 \\
1.60 & 1.56 & 1.47 & 1.51 & 1.48 \\
1.56 & 1.47 & 1.51 & 1.48 & 1.66
\end{bmatrix}
$$
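
The embedding step can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `trajectory_matrix` is a hypothetical helper, and the input is the nine-point example series quoted above.

```python
def trajectory_matrix(series, L):
    """Return the L x K lagged trajectory matrix, with K = N - L + 1."""
    N = len(series)
    if not 1 <= L <= N:
        raise ValueError("L must lie between 1 and N")
    K = N - L + 1
    # Entry (i, j) is series[i + j], so every antidiagonal (i + j constant)
    # holds a single value, which is the Hankel property.
    return [[series[i + j] for j in range(K)] for i in range(L)]


prices = [1.90, 1.62, 1.55, 1.60, 1.56, 1.47, 1.51, 1.48, 1.66]
M = trajectory_matrix(prices, L=5)
for row in M:
    print(row)
```

With N=9 and L=5 the matrix has K=5 columns, and its first and last rows are the first and last five-week windows of the series.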


The SVD is used after the embedding step for factorizing the trajectory matrix into biorthogonal 
elementary matrices [24]. This procedure is denoted by [27]: 

$$
M = \sum_{r=1}^{R} M_r = U \Sigma V^T = \sum_{r=1}^{R} \sigma_r u_r v_r^T \qquad (2)
$$

where $M_r$ is the $r^{th}$ elementary matrix, R is the number of elementary matrices, U and V are orthonormal 
systems with columns $u_r$ and $v_r$, and $\Sigma$ is a diagonal matrix whose diagonal elements, $\sigma_r$, are the 
singular values of M, i.e., the square roots of the eigenvalues $\lambda_r$ of $MM^T$. 
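
As an illustration of (2), the sketch below factorizes the example trajectory matrix with NumPy and rebuilds it from its rank-one elementary matrices. It assumes NumPy is available; `elementary_matrices` is a hypothetical helper name, not something taken from the paper.

```python
import numpy as np


def elementary_matrices(M):
    """Split M into rank-one matrices sigma_r * u_r * v_r^T using the SVD."""
    U, sigma, Vt = np.linalg.svd(M, full_matrices=False)
    parts = [sigma[r] * np.outer(U[:, r], Vt[r, :]) for r in range(len(sigma))]
    return parts, sigma


prices = np.array([1.90, 1.62, 1.55, 1.60, 1.56, 1.47, 1.51, 1.48, 1.66])
L = 5
K = len(prices) - L + 1
M = np.array([prices[i:i + K] for i in range(L)])  # trajectory matrix

parts, sigma = elementary_matrices(M)
print("singular values:", np.round(sigma, 4))
print("max reconstruction error:", np.abs(sum(parts) - M).max())
```

Summing all elementary matrices should reproduce M up to floating-point error, and the squared singular values should match the eigenvalues of $MM^T$.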


2.1.2. Reconstruction 

The second stage of the SSA is reconstruction, which involves projecting the time series onto data- 
adaptive eigenvectors to decrease the dataset dimensionality and express it in an optimum subspace [28]. The 
reconstruction stage includes two processes: grouping and diagonal averaging. Grouping is the process of 
categorizing elementary matrices into groups based on their eigentriples. The matrices within each 
group are then added together [16], [24]. Let $g = \{i_1, i_2, \ldots, i_n\}$ represent a group of n chosen eigentriples; 
then, the matrix $M_g$ for group g is written as: 


$$
M_g = M_{i_1} + M_{i_2} + \cdots + M_{i_n} \qquad (3)
$$


Splitting the index set, $r = 1, \ldots, R$, into m subsets, $g_1, g_2, \ldots, g_m$, renders the original mapped 
trajectory matrix as: 


$$
M = M_{g_1} + M_{g_2} + \cdots + M_{g_m} \qquad (4)
$$


The contribution of a component, $M_g$, to the trajectory matrix is represented by the ratio of its eigenvalues, 
$\sum_{i \in g} \lambda_i / \sum_{r=1}^{R} \lambda_r$. 

The second step in the reconstruction process is diagonal averaging along the N antidiagonals of 
the matrix $M_g$, also known as the Hankelization process $H(M_g)$. This process converts the matrix into a time 
series that is a component of the original series $s_t$. If $x_{g,ij}$ is an element of $M_g$, then the $k^{th}$ term of the 
reconstructed time series is obtained by averaging the elements $x_{g,ij}$ that satisfy i+j=k+1, where 
$1 \le k \le N$. The reconstructed time series has length N. Applying the diagonal averaging process to 
all the terms of (4) renders the decomposed components of the original series as: 


$$
s_t = H(M_{g_1}) + H(M_{g_2}) + \cdots + H(M_{g_m}) \qquad (5)
$$
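
Diagonal averaging can be sketched as follows. The function name `diagonal_average` is a hypothetical helper, and the code uses zero-based indices, so the condition i+j=k+1 from the text becomes i+j=k-1 shifted by one, i.e., entry (i, j) contributes to output position i+j.

```python
def diagonal_average(X):
    """Hankelize an L x K matrix (list of rows) into a series of length L + K - 1."""
    L, K = len(X), len(X[0])
    N = L + K - 1
    totals = [0.0] * N
    counts = [0] * N
    for i in range(L):
        for j in range(K):
            totals[i + j] += X[i][j]  # all entries on one antidiagonal share i + j
            counts[i + j] += 1
    return [total / count for total, count in zip(totals, counts)]


prices = [1.90, 1.62, 1.55, 1.60, 1.56, 1.47, 1.51, 1.48, 1.66]
hankel = [[prices[i + j] for j in range(5)] for i in range(5)]
reconstructed = diagonal_average(hankel)
print(reconstructed)
```

Applied to an exact Hankel matrix, as here, the averaging simply returns the original series; applied to a grouped matrix $M_g$, it returns the corresponding reconstructed component.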


2.2. Nonlinear autoregressive neural network (NARNN) 

The NARNN is a feed-forward dynamic network that forecasts future values of a time series by 
using its previous d values [29]. Hence, the NARNN utilizes the time series’ past behavior to predict its 
future behavior. Figure 1 illustrates the NARNN's structure utilized in our work. The network includes an 
input layer with a time delay line (TDL), two hidden layers, and an output layer. The TDL is configured with 
an eight-week feedback delay, which means that the previous eight closing prices are utilized to forecast the 
ninth closing price. The first and second hidden layers' activation functions are tan-sigmoid and log-sigmoid, 
respectively. Additionally, the output layer has a single neuron with linear activation. The Levenberg-Marquardt 
backpropagation (LMBP) algorithm is used in the network learning process. 
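
The data flow through such a network can be sketched in plain Python. This is only an illustration of the time delay line and the forward pass under stated assumptions: the helper names, the random weights, and the layer sizes passed to `random_params` are hypothetical, and the Levenberg-Marquardt training step is not shown.

```python
import math
import random


def make_lagged(series, d=8):
    """Time delay line: pair each value with the d values that precede it."""
    inputs = [series[t - d:t] for t in range(d, len(series))]
    targets = [series[t] for t in range(d, len(series))]
    return inputs, targets


def tansig(x):
    return math.tanh(x)


def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(x, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def narnn_forward(window, params):
    """tan-sigmoid layer, then log-sigmoid layer, then one linear output neuron."""
    h1 = dense(window, params["W1"], params["b1"], tansig)
    h2 = dense(h1, params["W2"], params["b2"], logsig)
    return dense(h2, params["W3"], params["b3"], lambda z: z)[0]


def random_params(d=8, n1=7, n2=9, seed=0):
    """Untrained placeholder weights, for shape checking only."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {"W1": matrix(n1, d), "b1": [0.0] * n1,
            "W2": matrix(n2, n1), "b2": [0.0] * n2,
            "W3": matrix(1, n2), "b3": [0.0]}


inputs, targets = make_lagged(list(range(10)), d=8)
print(inputs, targets)
print(narnn_forward([1.0] * 8, random_params()))
```

With d=8, each training pair consists of eight consecutive closing prices and the ninth price as the target, which matches the eight-week feedback delay described above.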


[Figure 1 diagram labels: Input, Hidden Layer 1, Hidden Layer 2, Output Layer, s(t)]


Figure 1. A typical NARNN structure


2.3. The hybrid SSA-NARNN model 

The SSA-NARNN model combines the SSA and NARNN models to forecast stock prices. First, the 
model employs the SSA to decompose the price series into separate components, as discussed in section 2.1. 
Then, the increment of singular entropy, $\Delta E$, represented by (6), is utilized to separate the noise component. 
When $\Delta E$ reaches an asymptotic value, the information in the time series has been extracted, and the remaining 
components represent the noise term. 


$$
\Delta E_i = -\left(\frac{\lambda_i}{\sum_{r=1}^{R} \lambda_r}\right) \log\left(\frac{\lambda_i}{\sum_{r=1}^{R} \lambda_r}\right) \qquad (6)
$$
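
A small sketch of how (6) might be evaluated is given below. Several points are assumptions rather than details from the paper: the natural logarithm is used because the log base is not stated, the eigenvalues are made-up illustrative numbers, and the tolerance `tol` used to decide when the increment has flattened out is a hypothetical choice.

```python
import math


def entropy_increments(eigenvalues):
    """Singular entropy increment of each order, following (6)."""
    total = sum(eigenvalues)
    increments = []
    for lam in eigenvalues:
        share = lam / total
        # A zero eigenvalue contributes nothing (limit of -p log p as p -> 0).
        increments.append(-share * math.log(share) if share > 0 else 0.0)
    return increments


def saturation_order(increments, tol):
    """Last order (1-based) whose increment is still at least tol."""
    order = 0
    for i, value in enumerate(increments, start=1):
        if value >= tol:
            order = i
    return order


eigenvalues = [100.0, 20.0, 5.0, 0.01, 0.005]  # illustrative values only
increments = entropy_increments(eigenvalues)
print([round(v, 5) for v in increments])
print("saturation order:", saturation_order(increments, tol=1e-2))
```

Elementary matrices beyond the saturation order would then be grouped into the noise term.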


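The covariance and the standard deviations, $\sigma_1$ and $\sigma_2$, of two components, $M_{g_1}$ and $M_{g_2}$, are utilized to calculate 
the correlation coefficient, $\rho$, as: 


$$
\rho = \frac{\mathrm{cov}(M_{g_1}, M_{g_2})}{\sigma_1 \sigma_2} \qquad (7)
$$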
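
The correlation coefficient in (7) can be computed as in the sketch below. It assumes that each component is represented as a plain sequence of numbers (for example, a reconstructed series or a flattened matrix); the function name `pearson` and the sample inputs are illustrative only.

```python
import math


def pearson(x, y):
    """Correlation coefficient: covariance divided by both standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sd_x * sd_y)


print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))    # perfectly correlated
print(pearson([1, 2, 3, 4], [1, -1, -1, 1]))  # uncorrelated
```

The same normalization (dividing by n) is used for the covariance and both standard deviations, so the factor cancels in the ratio.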


Components with $\rho$ > 0.4 are considered linearly dependent and aggregated. The hybrid SSA-NARNN model's 

steps are summarized as follows: 

a) Divide the stock prices into training and testing sets, 70% and 30%, respectively. 

b) Decompose the training set using SSA with L=14 [30]. 

c) Calculate $\Delta E$ for all singular values. 

d) Determine the order at which $\Delta E$ attains an asymptotic value, then group the subsequent 
elementary matrices into the noise term. 

e) Reconstruct the different components. 

f) Aggregate components with $\rho$ > 0.4. 

g) Create and train a NARNN for each component with the structure illustrated in Figure 1. 

h) Predict the price of each point in the testing dataset: 

- Decompose the previous prices in the manner described in steps b) to g). 

- Predict one step of each component. 

- Add the predicted values to obtain the final price. 
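
The walk-forward logic of step h) can be sketched as below. This is a skeleton under stated assumptions, not the authors' code: `walk_forward` is a hypothetical name, `toy_decompose` is a stand-in for the SSA decomposition (a three-point moving average plus remainder), and the naive last-value predictor stands in for the trained NARNNs.

```python
def walk_forward(prices, decompose, predictors, train_frac=0.7, d=8):
    """One-step-ahead forecasts for the testing set, using past prices only."""
    n_train = round(len(prices) * train_frac)
    forecasts = []
    for t in range(n_train, len(prices)):
        history = prices[:t]                # nothing from week t onward is used
        components = decompose(history)     # re-decompose all available prices
        step = [predict(component[-d:])     # feed the latest d decomposed points
                for predict, component in zip(predictors, components)]
        forecasts.append(sum(step))         # aggregate the component forecasts
    return forecasts


def toy_decompose(history):
    """Stand-in for SSA: a short moving average and the remainder."""
    smooth = []
    for i in range(len(history)):
        window = history[max(0, i - 2):i + 1]
        smooth.append(sum(window) / len(window))
    rest = [h - s for h, s in zip(history, smooth)]
    return [smooth, rest]


def naive(window):
    """Stand-in for a trained NARNN: repeat the last observed value."""
    return window[-1]


prices = [10.0, 10.4, 10.1, 10.8, 11.0, 10.7, 11.3, 11.6, 11.2, 11.9]
forecasts = walk_forward(prices, toy_decompose, [naive, naive])
print(forecasts)
```

The point of the loop is that the decomposition at week t only ever sees prices before week t, which is what prevents information about future performance from entering the forecast.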


3. RESULTS AND DISCUSSION 

The stock market data used in this research involves twenty-four stocks listed on the EGX-30 index 
with five-year historical data from January 2016 to December 2020. The following are the seven economic 
sectors of the twenty-four stocks involved in our analysis: i) basic resources, ii) non-bank financial services, 
iii) banks, iv) textile and durables, v) real estate, vi) food, beverages and tobacco, and vii) industrial goods, 
services & automobiles. The proposed SSA-NARNN model is utilized to forecast the weekly closing prices 
of the twenty-four stocks. First, a sensitivity analysis is carried out to determine the 
optimum number of neurons in the hidden layers of the NARNN. Figure 2 shows the average mean 
absolute percentage error (MAPE) of the predicted stocks versus the number of neurons in the NARNN hidden layers. The 
lowest average MAPE is obtained using seven and nine neurons in the first and second hidden layers, respectively. 


[Figure 2 plot: average MAPE (%) versus the number of neurons in the hidden layers [first, second], for configurations [7,7] through [10,10]]


Figure 2. Sensitivity analysis of the model performance to the number of neurons in NARNN hidden layers 


For demonstration, the SSA decomposition process of the COMI, the heaviest constituent of the 
EGX-30, is detailed. The first 70% of the available 260 trading weeks are chosen for training. The training 
data is decomposed into 14 elementary matrices, then the increment of singular entropy is determined. As 
illustrated in Figure 3, the increment of singular entropy saturates at the sixth order. Thus, the noise term is 
formed by combining elementary matrices 7 to 14. 

The next step is to reconstruct the various matrices and calculate the correlation coefficient, as 
indicated in Table 1. The first and second reconstructed components have a correlation coefficient of 0.298, 
showing that they are separable. Likewise, components six and seven are separable. On the other hand, 
each adjacent pair among components two through six has a higher correlation coefficient, $\rho$>0.4, and these 
components are thus integrated into one component. Figure 4 shows the reconstructed components of COMI's 
training set. RC1, RC2, and RC3 represent the market trend, fluctuation, and noise, respectively. 


[Figure 3 plot: increment of singular entropy, $\Delta E$, versus the order number (0 to 14)]


Figure 3. Increment of the singular entropy 


Table 1. Correlation coefficients matrix of the seven reconstructed matrices 


Reconstructed Matrix 1 2 3 4 5 6 7 
1 1 0.298 -0.025 0.014 -0.001 -0.026 -0.012 
2 0.298 1 0.421 0.057 0.067 0.009 0.005 
3 -0.025 0.421 1 0.549 0.092 0.125 0.032 
4 0.014 0.057 0.549 1 0.552 0.146 0.024 
5 -0.001 0.067 0.092 0.552 1 0.489 0.090 
6 -0.026 0.009 0.125 0.146 0.489 1 0.289 
7 -0.012 0.005 0.032 0.024 0.090 0.289 1 
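
The grouping implied by Table 1 can be reproduced with a short sketch. The merging rule used here (components linked by any pair with a coefficient above 0.4 end up in the same group) is an assumption about how the aggregation is carried out, and `group_components` is a hypothetical helper name; the matrix values are copied from Table 1.

```python
def group_components(corr, threshold=0.4):
    """Merge components that are linked by a correlation above the threshold."""
    n = len(corr)
    label = list(range(n))
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] > threshold:
                old, new = label[j], label[i]
                label = [new if value == old else value for value in label]
    groups = {}
    for index, value in enumerate(label):
        groups.setdefault(value, []).append(index + 1)  # 1-based component ids
    return list(groups.values())


table_1 = [
    [1, 0.298, -0.025, 0.014, -0.001, -0.026, -0.012],
    [0.298, 1, 0.421, 0.057, 0.067, 0.009, 0.005],
    [-0.025, 0.421, 1, 0.549, 0.092, 0.125, 0.032],
    [0.014, 0.057, 0.549, 1, 0.552, 0.146, 0.024],
    [-0.001, 0.067, 0.092, 0.552, 1, 0.489, 0.090],
    [-0.026, 0.009, 0.125, 0.146, 0.489, 1, 0.289],
    [-0.012, 0.005, 0.032, 0.024, 0.090, 0.289, 1],
]
groups = group_components(table_1)
print(groups)
```

Under this rule the seven reconstructed matrices collapse into three groups, consistent with the three reconstructed components reported for COMI.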

Figure 4. COMI training prices and the three reconstructed components using SSA 

Each reconstructed component is utilized for training a NARNN. The latest eight decomposed 
points are then supplied to the trained NARNNs in order to forecast the components of the first point in the 
testing dataset. To predict the components of the second point, the SSA is used to decompose all previous 
prices, i.e., the training set plus the first testing point. The NARNNs are then fed the most recent eight weeks' 
decomposed data. This procedure is repeated until all of the points in the testing set have been predicted. 
Finally, the predicted decomposed components are aggregated to obtain the weekly closing price using 
the SSA-NARNN model. Figure 5 compares the predicted versus actual weekly closing prices of the COMI's 
testing dataset. 

To assess the SSA-NARNN model's performance, we 
compared it with the ARIMA and single NARNN models. Table 2 shows each model's evaluation 
criteria, the root mean square error (RMSE) and the MAPE. The results demonstrate that the suggested SSA-NARNN outperforms 
the ARIMA model and the single NARNN without data preprocessing. Owing to the reduction of noise and 
nonstationarity in the price data, the SSA increased the system's learning and generalization abilities. This 
analysis indicates that the proposed SSA-NARNN model can predict stock prices in financial markets. 
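
For reference, the two evaluation criteria can be computed as in the sketch below. The paper does not spell out the formulas, so the standard definitions of RMSE and MAPE are assumed here; the function names and sample numbers are illustrative only.

```python
import math


def rmse(actual, predicted):
    """Root mean square error, in the same units as the prices."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
print(mape([100.0, 200.0], [110.0, 180.0]))
```

RMSE penalizes large errors more heavily and depends on the price level of each stock, whereas MAPE is scale-free, which makes it the easier criterion to compare across stocks.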


[Figure 5 plot: actual data and predicted data for the COMI testing set; y-axis: Value, EGP]


Figure 5. SSA-NARNN model's predicted weekly closing prices of the COMI stock 


Table 2. Evaluation of ARIMA, NARNN, and SSA-NARNN performance for the twenty-four stocks 


Stock   RMSE                          MAPE (%)
        ARIMA   NARNN   SSA-NARNN     ARIMA   NARNN   SSA-NARNN 
ABUK    2.45    2.95    1.51          10.2    12.3    6.8 
AMOC    0.52    0.43    0.29          13.2    10.4    iP) 
ESRS    0.97    1.92    0.89          8.8     19.7    8 
SKPC    1.34    3.47    1.04          12.2    43.5    10.1 
CCAP    0.29    0.36    0.22          12.1    15.5    8.8 
EKHO    0.07    0.06    0.05          4.3     3.6     3.1 
HRHO    1.26    1.21    0.94          6.8     6.5     5.2 
OIH     0.08    0.13    0.06          14.1    23.1    10.5 
PIOH    0.53    1.92    0.41          11.8    44.3    8.5 
COMI    5.21    6.25    3.13          5.2     6.2     3.6 
CIEB    2.73    4.12    2.31          6.3     11.1    4.8 
EXPA    0.75    0.75    0.73          5.8     5.8     4.8 
ORWE    0.41    0.52    0.33          4.8     6.8     3.4 
EMFD    0.18    0.12    0.11          4.3     3.3     3.2 
HELI    0.56    0.46    0.45          7.8     6       5.9 
MNHD    1.12    1.09    0.73          27.6    27      18.5 
OCDI    1.23    1       0.92          7.7     6.5     5.9 
ORHD    0.59    0.51    0.5           12.5    9.4     9.3 
PHDC    0.31    0.79    0.19          16.6    52.1    11.4 
TMGH    0.63    0.52    0.44          6.9     6       5.1 
EAST    0.86    0.79    0.73          5.9     4.9     4 
EFID    0.73    0.68    0.62          5.6     5       4.1 
AUTO    0.37    0.33    0.26          10.4    9.3     7.7 
SWDY    0.81    0.74    0.66          7.3     5.9     aH 


4. CONCLUSION 

Given the importance of stock market prediction and the difficulties associated with it, 
researchers are constantly attempting new methods to examine these markets. NARNN is a machine learning 
model that contains a TDL, which lessens short-term volatility. Also, the SSA is among the effective data 
preprocessing techniques that have recently been considered to reduce the noise and extract hidden 
information from time series. Therefore, this paper introduces a new model, the SSA-NARNN model, to 
forecast stock prices. Using SSA, the model first decomposes the financial time series into various 
components. The decomposed components are then supplied to the NARNN, which dampens the short-term 
volatility using the eight preceding time steps. The model performance is validated with twenty-four stocks 
listed on the Egyptian Exchange. Results indicate that the SSA increased the NARNN's learning and 
generalization abilities, and the SSA-NARNN model outperforms the ARIMA and NARNN models. This study 
recommends the following for future research: i) developing a decision-based trading strategy using the 
proposed SSA-NARNN model to provide buy and sell signals and ii) integrating the price prediction model 
with other intelligent models to build portfolios with higher returns. 